Integrating HMM-Based Speech Recognition With Direct Manipulation In A Multimodal Korean Natural Language Interface
Authors
Abstract
This paper presents an HMM-based speech recognition engine and its integration into direct manipulation interfaces for a Korean document editor. Speech recognition can reduce the tedious and repetitive actions that are unavoidable in standard GUIs (graphical user interfaces). Our system consists of a general speech recognition engine called ABrain and a speech-commandable document editor called SHE. ABrain is a phoneme-based speech recognition engine that achieves a discrete-command recognition rate of up to 97%. SHE is a EuroBridge widget-based document editor that supports speech commands as well as direct manipulation interfaces.
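The abstract gives no implementation details, but the standard recipe behind a phoneme-based HMM recognizer for discrete commands is to model each command as a left-to-right chain of phoneme states and to pick the command whose model best explains the observed acoustic frames. The following is a minimal, purely illustrative Python sketch of that idea, using Viterbi scoring over toy Gaussian emissions; the command vocabulary, feature vectors, and all parameters here are invented for the example and are not taken from ABrain or SHE.

```python
import numpy as np

def log_gauss(frames, means, var=1.0):
    # Per-frame log-likelihood of each state's unit-variance Gaussian.
    # frames: (T, D) acoustic features; means: (S, D) state means.
    diff = frames[:, None, :] - means[None, :, :]          # (T, S, D)
    return -0.5 * np.sum(diff ** 2, axis=-1) / var         # (T, S)

def left_to_right(n_states, stay=0.6):
    # Log-transition matrix for a self-loop/advance phoneme chain.
    A = np.full((n_states, n_states), 1e-12)
    for s in range(n_states):
        A[s, s] = stay
        if s + 1 < n_states:
            A[s, s + 1] = 1.0 - stay
    return np.log(A)

def viterbi_score(log_trans, log_emit):
    # Best log-probability of any state path through a left-to-right HMM,
    # starting in state 0 and ending in the last state.
    T, S = log_emit.shape
    delta = np.full(S, -np.inf)
    delta[0] = log_emit[0, 0]
    for t in range(1, T):
        delta = np.max(delta[:, None] + log_trans, axis=0) + log_emit[t]
    return delta[-1]

def recognize(frames, models):
    # Score the utterance against every command's phoneme-chain HMM
    # and return the best-matching command.
    scores = {cmd: viterbi_score(left_to_right(len(means)),
                                 log_gauss(frames, means))
              for cmd, means in models.items()}
    return max(scores, key=scores.get)

# Toy two-command vocabulary: each command is a chain of phoneme states,
# each state summarized by a mean feature vector (illustrative numbers).
models = {
    "open":  np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]),
    "close": np.array([[2.0, 2.0], [1.0, 0.0], [0.0, 2.0]]),
}
utterance = np.array([[0.1, -0.1], [0.9, 1.1], [1.1, 0.9], [2.0, 0.1]])
print(recognize(utterance, models))  # -> open
```

A real engine would use MFCC-style features and per-phoneme Gaussian-mixture emissions trained on speech data, but the decoding structure is the same.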
Similar Papers
Integrated speech and morphological processing in a connectionist continuous speech understanding for Korean
A new, tightly coupled speech and natural language integration model is presented for a TDNN-based continuous, possibly large-vocabulary speech recognition system for Korean. Unlike the popular n-best techniques developed mainly for integrating HMM-based speech recognition and natural language processing at the word level, which is clearly inadequate for a morphologically complex agglutinative language...
Eucalyptus: Integrating Natural Language Input with a Graphical User Interface
This report describes Eucalyptus, a natural language (NL) interface that has been integrated with the graphical user interface of the KOALAS Test Planning Tool, a simulated Naval air combat command system. The multimodal, multimedia interface handles both imperative commands and database queries (either typed or spoken into a microphone) while still allowing full use of the original graphical i...
Speech Interfaces to Virtual Reality
In this paper, we consider how speech interfaces can be combined with a direct manipulation interface to virtual reality. We outline the benefits of adding a speech interface and the requirements it imposes on speech recognition, language processing, and interaction design. We describe the multimodal DIVERSE system, which provides a speech interface to virtual worlds modelled in DIVE. This system can...
Speech and Language Processing for Multimodal Human-Computer Interaction
In this paper, we describe our recent work at Microsoft Research, in the project codenamed Dr. Who, aimed at the development of enabling technologies for speech-centric multimodal human-computer interaction. In particular, we present in detail MiPad as the first Dr. Who application, which specifically addresses the mobile user interaction scenario. MiPad is a wireless mobile PDA prototype that ...
Speech Input in Multimodal Environments: A Proposal to Study the Effects of Reference Visibility, Reference Number, and Task Integration
A model of complementary behavior has been suggested, based on arguments that direct manipulation and speech recognition interfaces have complementary strengths and weaknesses. Specifically, anecdotal arguments have been made that direct manipulation interfaces are best used for specifying simple actions when all references are visible and the number of references is limited, while speech recog...
Journal: CoRR
Volume: cmp-lg/9611005
Pages: -
Year of publication: 1996